Tradeoffs for nearest neighbors on the sphere
We consider tradeoffs between the query and update complexities for the
(approximate) nearest neighbor problem on the sphere, extending the recent
spherical filters to sparse regimes and generalizing the scheme and analysis to
account for different tradeoffs. In a nutshell, for the sparse regime the
tradeoff between the query complexity and update complexity
for data sets of size is given by the following equation in
terms of the approximation factor and the exponents and :
For small , minimizing the time for updates leads to a linear
space complexity at the cost of a query time complexity .
Balancing the query and update costs leads to optimal complexities
, matching bounds from [Andoni-Razenshteyn, 2015] and [Dubiner,
IEEE-TIT'10] and matching the asymptotic complexities of [Andoni-Razenshteyn,
STOC'15] and [Andoni-Indyk-Laarhoven-Razenshteyn-Schmidt, NIPS'15]. A
subpolynomial query time complexity can be achieved at the cost of a
space complexity of the order , matching the bound
of [Andoni-Indyk-Patrascu, FOCS'06] and
[Panigrahy-Talwar-Wieder, FOCS'10] and improving upon results of
[Indyk-Motwani, STOC'98] and [Kushilevitz-Ostrovsky-Rabani, STOC'98].
For large , minimizing the update complexity results in a query complexity
of , improving upon the related exponent for large of
[Kapralov, PODS'15] by a factor , and matching the bound
of [Panigrahy-Talwar-Wieder, FOCS'08]. Balancing the costs leads to optimal
complexities , while a minimum query time complexity can be
achieved with update complexity , improving upon the
previous best exponents of Kapralov by a factor .
Comment: 16 pages, 1 table, 2 figures. Mostly subsumed by arXiv:1608.03580
[cs.DS] (along with arXiv:1605.02701 [cs.DS]).
Faster tuple lattice sieving using spherical locality-sensitive filters
To overcome the large memory requirement of classical lattice sieving
algorithms for solving hard lattice problems, Bai-Laarhoven-Stehlé [ANTS
2016] studied tuple lattice sieving, where tuples instead of pairs of lattice
vectors are combined to form shorter vectors. Herold-Kirshanova [PKC 2017]
recently improved upon their results for arbitrary tuple sizes, for example
showing that a triple sieve can solve the shortest vector problem (SVP) in
dimension in time , using a technique similar to
locality-sensitive hashing for finding nearest neighbors.
In this work, we generalize the spherical locality-sensitive filters of
Becker-Ducas-Gama-Laarhoven [SODA 2016] to obtain space-time tradeoffs for near
neighbor searching on dense data sets, and we apply these techniques to tuple
lattice sieving to obtain even better time complexities. For instance, our
triple sieve heuristically solves SVP in time . For
practical sieves based on Micciancio-Voulgaris' GaussSieve [SODA 2010], this
shows that a triple sieve uses less space and less time than the current best
near-linear space double sieve.
Comment: 12 pages + references, 2 figures. Subsumed/merged into Cryptology
ePrint Archive 2017/228, available at https://ia.cr/2017/122
Dynamic Traitor Tracing Schemes, Revisited
We revisit recent results from the area of collusion-resistant traitor
tracing, and show how they can be combined and improved to obtain more
efficient dynamic traitor tracing schemes. In particular, we show how the
dynamic Tardos scheme of Laarhoven et al. can be combined with the optimized
score functions of Oosterwijk et al. to trace coalitions much faster. If the
attack strategy is known, in many cases the order of the code length goes down
from quadratic to linear in the number of colluders, while if the attack is not
known, we show how the interleaving defense may be used to catch all colluders
about twice as fast as in the dynamic Tardos scheme. Some of these results also
apply to the static traitor tracing setting where the attack strategy is known
in advance, and to group testing.
Comment: 7 pages, 1 figure (6 subfigures), 1 table
Asymptotics of Fingerprinting and Group Testing: Capacity-Achieving Log-Likelihood Decoders
We study the large-coalition asymptotics of fingerprinting and group testing,
and derive explicit decoders that provably achieve capacity for many of the
considered models. We do this both for simple decoders (fast but suboptimal)
and for joint decoders (slow but optimal), and both for informed and uninformed
settings.
For fingerprinting, we show that if the pirate strategy is known, the
Neyman-Pearson-based log-likelihood decoders provably achieve capacity,
regardless of the strategy. The decoder built against the interleaving attack
is further shown to be a universal decoder, able to deal with arbitrary attacks
and achieving the uninformed capacity. This universal decoder is shown to be
closely related to the Lagrange-optimized decoder of Oosterwijk et al. and the
empirical mutual information decoder of Moulin. Joint decoders are also
proposed, and we conjecture that these also achieve the corresponding joint
capacities.
For group testing, the simple decoder for the classical model is shown to be
more efficient than the one of Chan et al. and it provably achieves the simple
group testing capacity. For generalizations of this model such as noisy group
testing, the resulting simple decoders also achieve the corresponding simple
capacities.
Comment: 14 pages, 2 figures
Efficient Probabilistic Group Testing Based on Traitor Tracing
Inspired by recent results from collusion-resistant traitor tracing, we
provide a framework for constructing efficient probabilistic group testing
schemes. In the traditional group testing model, our scheme asymptotically
requires T ~ 2 K ln N tests to find (with high probability) the correct set of
K defectives out of N items. The framework is also applied to several noisy
group testing and threshold group testing models, often leading to improvements
over previously known results, but we emphasize that this framework can be
applied to other variants of the classical model as well, both in adaptive and
in non-adaptive settings.
Comment: 8 pages, 3 figures, 1 table
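As a rough illustration of the asymptotic test count T ~ 2 K ln N quoted above, the following sketch evaluates the leading-order term for a few sizes. The function name `asymptotic_tests` is a placeholder introduced here, and lower-order terms are ignored, so the numbers are only indicative.

```python
import math


def asymptotic_tests(num_defectives, num_items):
    """Leading-order test count T ~ 2 K ln N (lower-order terms ignored)."""
    return 2 * num_defectives * math.log(num_items)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    for k, n in [(5, 1_000), (10, 1_000_000)]:
        print(f"K={k}, N={n}: about {asymptotic_tests(k, n):.0f} tests")
```

The estimate grows linearly in K and only logarithmically in N, which is why pooling tests is attractive when defectives are rare.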
Asymptotics of Fingerprinting and Group Testing: Tight Bounds from Channel Capacities
In this work we consider the large-coalition asymptotics of various
fingerprinting and group testing games, and derive explicit expressions for the
capacities for each of these models. We do this both for simple decoders (fast
but suboptimal) and for joint decoders (slow but optimal).
For fingerprinting, we show that if the pirate strategy is known, the
capacity often decreases linearly with the number of colluders, instead of
quadratically as in the uninformed fingerprinting game. For many attacks the
joint capacity is further shown to be strictly higher than the simple capacity.
For group testing, we improve upon known results about the joint capacities,
and derive new explicit asymptotics for the simple capacities. These show that
existing simple group testing algorithms are suboptimal, and that simple
decoders cannot asymptotically be as efficient as joint decoders. For the
traditional group testing model, we show that the gap between the simple and
joint capacities is a factor 1.44 for large numbers of defectives.
Comment: 14 pages, 6 figures
Capacities and Capacity-Achieving Decoders for Various Fingerprinting Games
Combining an information-theoretic approach to fingerprinting with a more
constructive, statistical approach, we derive new results on the fingerprinting
capacities for various informed settings, as well as new log-likelihood
decoders with provable code lengths that asymptotically match these capacities.
The simple decoder built against the interleaving attack is further shown to
achieve the simple capacity for unknown attacks, and is argued to be an
improved version of the recently proposed decoder of Oosterwijk et al. With
this new universal decoder, cut-offs on the bias distribution function can
finally be dismissed.
Besides the application of these results to fingerprinting, a direct
consequence of our results to group testing is that (i) a simple decoder
asymptotically requires a factor 1.44 more tests to find defectives than a
joint decoder, and (ii) the simple decoder presented in this paper provably
achieves this bound.
Comment: 13 pages, 2 figures
The Collatz conjecture and De Bruijn graphs
We study variants of the well-known Collatz graph, by considering the action
of the 3n+1 function on congruence classes. For moduli equal to powers of 2,
these graphs are shown to be isomorphic to binary De Bruijn graphs. Unlike the
Collatz graph, these graphs are very structured, and have several interesting
properties. We then look at a natural generalization of these finite graphs to
the 2-adic integers, and show that the isomorphism between these infinite
graphs is exactly the conjugacy map previously studied by Bernstein and
Lagarias. Finally, we show that for generalizations of the 3n+1 function, we
get similar relations with 2-adic and p-adic De Bruijn graphs.
Comment: 9 pages, 8 figures
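The finite isomorphism described in this abstract can be checked by brute force for small moduli. The sketch below rests on assumptions not spelled out in the abstract: it uses the shortcut map T(n) = n/2 for even n and (3n+1)/2 for odd n, it takes an edge from a class a mod 2^k to both classes mod 2^k that lift T(a) mod 2^(k-1), and it uses the first k parities of the trajectory as the candidate isomorphism. The paper's exact definitions may differ.

```python
def T(n):
    """Shortcut 3n+1 map (assumed form): halve evens, (3n+1)/2 for odds."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def collatz_edges(k):
    """Edges between residue classes mod 2^k induced by T."""
    m, half = 1 << k, 1 << (k - 1)
    edges = set()
    for a in range(m):
        t = T(a) % half          # T(a) is only determined mod 2^(k-1)
        edges.add((a, t))        # both lifts to classes mod 2^k
        edges.add((a, t + half))
    return edges


def de_bruijn_edges(k):
    """Binary De Bruijn graph on k-bit words: drop the first bit, append one."""
    m = 1 << k
    edges = set()
    for w in range(m):
        shifted = (w << 1) & (m - 1)
        edges.add((w, shifted))
        edges.add((w, shifted | 1))
    return edges


def parity_word(a, k):
    """Pack the parities of a, T(a), ..., T^(k-1)(a) into a k-bit word."""
    w = 0
    for _ in range(k):
        w = (w << 1) | (a & 1)
        a = T(a)
    return w


def parity_map_is_isomorphism(k):
    """Check that a -> parity_word(a, k) is a bijection that preserves edges."""
    words = {a: parity_word(a, k) for a in range(1 << k)}
    if len(set(words.values())) != 1 << k:
        return False
    mapped = {(words[a], words[b]) for a, b in collatz_edges(k)}
    return mapped == de_bruijn_edges(k)


if __name__ == "__main__":
    for k in range(1, 9):
        print(k, parity_map_is_isomorphism(k))
```

Under these assumptions, the edge relation becomes a shift on parity words: moving from a to T(a) drops the first parity, and the choice of lift supplies the new last bit.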
Discrete Distributions in the Tardos Scheme, Revisited
The Tardos scheme is a well-known traitor tracing scheme to protect
copyrighted content against collusion attacks. The original scheme contained
some suboptimal design choices, such as the score function and the distribution
function used for generating the biases. Skoric et al. previously showed that a
symbol-symmetric score function leads to shorter codes, while Nuida et al.
obtained the optimal distribution functions for arbitrary coalition sizes.
Later, Nuida et al. showed that combining these results leads to even shorter
codes when the coalition size is small. We extend their analysis to the case of
large coalitions and prove that these optimal distributions converge to the
arcsine distribution, thus showing that the arcsine distribution is
asymptotically optimal in the symmetric Tardos scheme. We also present a new,
practical alternative to the discrete distributions of Nuida et al. and give a
comparison of the estimated lengths of the fingerprinting codes for each of
these distributions.
Comment: 5 pages, 2 figures
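For readers unfamiliar with the two design choices discussed here, the following sketch shows one common way to write them down for a binary alphabet: biases drawn from the arcsine distribution on (0, 1), and a symbol-symmetric score function. The function names and the exact parametrization are assumptions made for illustration. The sketch uses no cut-off on the bias and does not reproduce the discrete distributions of Nuida et al.

```python
import math
import random


def sample_arcsine_bias(rng):
    """Draw p from the arcsine distribution on (0, 1) as p = sin^2(theta)."""
    while True:
        p = math.sin(rng.uniform(0.0, math.pi / 2)) ** 2
        if 0.0 < p < 1.0:  # reject the endpoints so scores stay finite
            return p


def symmetric_score(user_symbol, pirate_symbol, p):
    """Symbol-symmetric score; p is the probability of symbol 1 at this position."""
    q = p if pirate_symbol == 1 else 1.0 - p  # probability of the pirate's symbol
    if user_symbol == pirate_symbol:
        return math.sqrt((1.0 - q) / q)   # a match on a rare symbol is incriminating
    return -math.sqrt(q / (1.0 - q))      # a mismatch lowers the score


if __name__ == "__main__":
    rng = random.Random(2024)
    p = sample_arcsine_bias(rng)
    print("bias:", p)
    print("score for a matching symbol 1:", symmetric_score(1, 1, p))
```

With this normalization, an innocent user's score at a single position has mean 0 and variance 1 for every bias p, which is what makes accumulated scores comparable across positions.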
Hypercube LSH for Approximate near Neighbors
A celebrated technique for finding near neighbors for the angular distance involves using a set of random hyperplanes to partition the space into hash regions [Charikar, STOC 2002]. Experiments later showed that using a set of orthogonal hyperplanes, thereby partitioning the space into the Voronoi regions induced by a hypercube, leads to even better results [Terasawa and Tanaka, WADS 2007]. However, no theoretical explanation for this improvement was ever given, and it remained unclear how the resulting hypercube hash method scales in high dimensions.
In this work, we provide explicit asymptotics for the collision probabilities when using hypercubes to partition the space. For instance, two near-orthogonal vectors are expected to collide with probability (1/pi)^d in dimension d, compared to (1/2)^d when using random hyperplanes. Vectors at angle pi/3 collide with probability (sqrt(3)/pi)^d, compared to (2/3)^d for random hyperplanes, and near-parallel vectors collide with similar asymptotic probabilities in both cases.
For c-approximate nearest neighbor searching, this translates to a decrease in the exponent rho of locality-sensitive hashing (LSH) methods by a factor of up to log2(pi) ~ 1.652 compared to hyperplane LSH. For c = 2, we obtain rho ~ 0.302 for hypercube LSH, improving upon the rho ~ 0.377 for hyperplane LSH. We further describe how to use hypercube LSH in practice, and we consider an example application in the area of lattice algorithms.
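The two partitions compared in this abstract can be sketched as hash functions. The code below is a minimal illustration and not the paper's implementation: it assumes that hyperplane LSH takes the signs of d independent Gaussian projections, and that hypercube LSH applies a random rotation and then takes coordinate signs, which selects the nearest vertex of the rotated hypercube. All function names are placeholders.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gaussian_rows(d, rng):
    """d independent random hyperplane normals (hyperplane LSH)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]


def random_rotation(d, rng):
    """d orthonormal rows via Gram-Schmidt on Gaussian vectors (hypercube LSH)."""
    basis = []
    while len(basis) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for b in basis:
            c = dot(v, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-9:  # skip numerically degenerate draws
            basis.append([x / norm for x in v])
    return basis


def sign_hash(rows, x):
    """Hash value: on which side of each hyperplane the vector x lies."""
    return tuple(dot(r, x) >= 0.0 for r in rows)


if __name__ == "__main__":
    rng = random.Random(0)
    d = 8
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    print("hyperplane hash:", sign_hash(gaussian_rows(d, rng), x))
    print("hypercube hash: ", sign_hash(random_rotation(d, rng), x))
    # Ratio of exponents for near-orthogonal vectors: ln(1/pi) / ln(1/2).
    print("log2(pi) =", math.log2(math.pi))
```

The only difference between the two schemes in this sketch is whether the hyperplane normals are independent or mutually orthogonal. The quoted collision probabilities (1/2)^d and (1/pi)^d for near-orthogonal vectors are asymptotic statements, so the sketch does not try to reproduce them empirically.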